Probability and statistics cheat sheet

Sample

Mean

"the arithmetic average of $n$ R.V.s. from a random sample of size $n$

$$ \bar{x} = \dfrac{1}{n} \sum_{i=1}^{n} x_i $$

or

"the arithmetic average of the realisation of $n$ R.V.s".

$$ \bar{X} = \dfrac{1}{n} \sum_{i=1}^{n} X_i $$

where $\bar{X}$ is an R.V. with an expectation and variance.

Variance

$$ s^2 = \frac{\displaystyle\sum_{i=1}^{n}(x_i - \bar{x})^2} {n-1} $$

The $n-1$ divisor corrects the bias introduced by estimating $\mu$ with $\bar{x}$.

Expectation

$$ E(\bar{X}_n) = \mu $$
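
A minimal sketch of these sample statistics in Python (NumPy assumed available; the data values are made up):

```python
import numpy as np

x = np.array([2.1, 3.4, 1.8, 4.0, 2.9])  # hypothetical sample realisations

x_bar = x.sum() / len(x)                      # sample mean
s2 = ((x - x_bar) ** 2).sum() / (len(x) - 1)  # unbiased sample variance

# NumPy equivalents; ddof=1 selects the (n - 1) divisor
assert np.isclose(x_bar, np.mean(x))
assert np.isclose(s2, np.var(x, ddof=1))
```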

Population

Mean

$$ \mu = \dfrac{1}{N} \sum_{i=1}^{N} x_i $$

Variance

$$ \sigma^2 = \frac{\displaystyle\sum_{i=1}^{N}(x_i - \mu)^2} {N} $$

PDF/CDF

$f_X(x)$ is the PDF of the R.V. $X$.

$F_X(x)$ is the CDF of the R.V. $X$.

$$ f_X(x) = \frac{d}{dx}F_X(x) $$
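
As a quick numerical check (a sketch, assuming SciPy is installed), the derivative relation holds for, e.g., the standard normal:

```python
import numpy as np
from scipy.stats import norm

x, h = 0.7, 1e-6
# central finite difference of the CDF approximates the PDF
dF = (norm.cdf(x + h) - norm.cdf(x - h)) / (2 * h)
assert np.isclose(dF, norm.pdf(x), atol=1e-6)
```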

Expectation

$$ E(X) = \int_{-\infty}^{\infty} xf_X(x)\,dx $$

$$ E(aX + b) = aE(X) + b $$

$$ \textrm{Var}(aX + b) = a^2\textrm{Var}(X) $$

Conditional

$$ E(Y|X) = \int y f_{Y|X}(y|x)\,dy $$

where $E(Y|X)$ is a random variable.

Law of iterated expectations

$$ E(E(Y|X)) = E(Y) $$

Law of iterated variances

$$ \textrm{Var}(E(Y|X)) + E(\textrm{Var}(Y|X)) = \textrm{Var}(Y) $$

$$ \textrm{Var}(X) = E(X^2)-[E(X)]^{2} $$
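
A simulation sketch of both laws (assuming NumPy; the particular choice of $X$ exponential and $Y \mid X \sim N(X, 1)$ is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

x = rng.exponential(scale=2.0, size=n)  # E(X) = 2, Var(X) = 4
y = rng.normal(loc=x, scale=1.0)        # Y | X ~ Normal(X, 1)

# E(E(Y|X)) = E(X) should match E(Y)
print(y.mean(), x.mean())      # both ~ 2
# Var(E(Y|X)) + E(Var(Y|X)) = Var(X) + 1 should match Var(Y)
print(y.var(), x.var() + 1.0)  # both ~ 5
```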

Distributions

Bernoulli

"one trial, two possible outcomes with probability of success, $p$."

Expectation

$$ E(X) = p $$

Variance

$$ \textrm{Var}(X) = pq $$

where $q = 1-p$.

Binomial

"$n$ trials with $k$ successes given a probability of success, $p$."

For large $n$ and small $p$ (with $np$ held fixed), the binomial distribution approximates the Poisson.

How many combinations, "$n$ choose $k$"?

$$ {n \choose k} = \frac{n!}{k!(n-k)!} $$

Expectation

$$ E(X) = np $$

Variance

$$ \textrm{Var}(X) = np(1-p) $$

Probability of $k$ successes in $n$ trials

Think: the number of combinations $\times$ the probability of a single combination occurring.

$$ P(X = k) = {n \choose k}p^{k}(1-p)^{n-k} $$
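
A direct translation of the PMF into Python (a sketch; with $n = 1$ it reduces to the Bernoulli):

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k): combinations times the probability of one combination."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

print(binom_pmf(3, 10, 0.5))  # P(3 heads in 10 fair flips) ~ 0.117
```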

Hypergeometric

Probability $X = k$

$$ P(X = k) = \dfrac{{K \choose k}{N-K \choose n-k}}{{N \choose n}} $$

where $N$ is the population size, $K$ is the number of successes in the population, $n$ is the number of draws, and $k$ is the number of observed successes.
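
The same formula in Python (a sketch; the card-drawing example is illustrative):

```python
from math import comb

def hypergeom_pmf(k: int, N: int, K: int, n: int) -> float:
    """P(X = k) when drawing n items without replacement from a
    population of N items containing K successes."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

# probability of exactly 2 aces in a 5-card hand from a 52-card deck
print(hypergeom_pmf(2, 52, 4, 5))  # ~ 0.040
```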

Geometric

"number of trials, $n$, until success given a probability, $p$".

Expectation

$$ E(X) = 1/p $$

Variance

$$ \textrm{Var}(X) = \dfrac{(1-p)}{p^{2}} $$

Probability of first success on the $n$th attempt

Think: $n-1$ failures followed by one success.

$$ P(X = n) = (1-p)^{n-1}p $$
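
As a sketch in Python (the fair-coin example is illustrative):

```python
def geom_pmf(n: int, p: float) -> float:
    """P(first success on trial n): n - 1 failures, then a success."""
    return (1 - p) ** (n - 1) * p

print(geom_pmf(3, 0.5))  # tail, tail, head with a fair coin = 0.125
print(1 / 0.5)           # E(X) = 1/p = 2 trials on average
```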

Poisson

$N_{t}$ is an integer-valued R.V. corresponding to the number of events occurring up to time $t$.

$\lambda = \gamma t$ where $\gamma$ is the propensity to arrive per unit of time, and $t$ is a number in units of time.

Expectation

$$ E(N_{t}) = \lambda $$

Variance

$$ \textrm{Var}(N_{t}) = \lambda $$

Probability of $k$ events

$$ P(N_{t} = k) = \dfrac{\lambda^{k}e^{-\lambda}}{k!} $$
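
A direct Python translation (a sketch; the arrival-rate numbers are made up):

```python
from math import exp, factorial

def poisson_pmf(k: int, lam: float) -> float:
    """P(N_t = k) for a Poisson process with lam = gamma * t."""
    return lam**k * exp(-lam) / factorial(k)

# e.g. gamma = 2 arrivals per hour over t = 1.5 hours => lam = 3
print(poisson_pmf(2, 3.0))  # ~ 0.224
```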

Exponential

Waiting time between two events in a Poisson process:

$$ f_X(x) = \lambda e^{-\lambda x} $$

for $x>0$.

The distribution is memoryless: since

$$ P(X\geq t) = e^{-\lambda t}, $$

it follows that $P(X \geq s+t \mid X \geq s) = P(X \geq t)$.

Expectation

$$ E(X) = \dfrac{1}{\lambda} $$

Variance

$$ \textrm{Var}(X) = \dfrac{1}{\lambda^{2}} $$
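
A simulation sketch of the survival function and memorylessness (assuming NumPy; $\lambda$, $s$, and $t$ are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 0.5
x = rng.exponential(scale=1 / lam, size=1_000_000)

# survival function: P(X >= t) = exp(-lam * t)
t = 2.0
print((x >= t).mean(), np.exp(-lam * t))    # both ~ 0.368

# memorylessness: P(X >= s + t | X >= s) = P(X >= t)
s = 1.0
print((x >= s + t).sum() / (x >= s).sum())  # also ~ 0.368
```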

Uniform

This distribution is continuous, with density $f(x) = \dfrac{1}{b-a}$ for $a \leq x \leq b$.

Expectation

$$ E(X) = \dfrac{(a+b)}{2} $$

Variance

$$ \textrm{Var}(X) = \dfrac{(b-a)^{2}}{12} $$
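
A quick simulation check of both moments (assuming NumPy; $a$ and $b$ are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = 2.0, 8.0
x = rng.uniform(a, b, size=1_000_000)

print(x.mean(), (a + b) / 2)       # both ~ 5
print(x.var(), (b - a) ** 2 / 12)  # both ~ 3
```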

Normal

$$ f(x) = \dfrac{1}{{\sigma\sqrt{2\pi}}} e^{-(x - \mu)^{2}/(2\sigma^{2}) } $$

A linear transformation of a normally-distributed R.V. is itself normally distributed.
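
A simulation sketch of this closure property (assuming NumPy; the parameters are arbitrary): $Y = aX + b$ has mean $a\mu + b$ and standard deviation $|a|\sigma$.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=1.0, scale=2.0, size=1_000_000)

a, b = 3.0, -1.0
y = a * x + b  # still normal
print(y.mean(), a * 1.0 + b)  # ~ 2
print(y.std(), abs(a) * 2.0)  # ~ 6
```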

Covariance ($\sigma_{XY}$)

Some basic relations:

$$ \textrm{Cov}(X,Y) = \sigma_{XY} = E[(X-\mu_X)(Y-\mu_Y)] $$

$$ \textrm{Cov}(X,X) = \textrm{Var}(X) $$

$$ \textrm{Cov}(X,Y) = E(XY)-E(X)E(Y) $$

$$ \textrm{Cov}(aX+b,cY+d) = ac\,\textrm{Cov}(X,Y) $$

$$ \textrm{Var}(X+Y) = \textrm{Var}(X) + \textrm{Var}(Y) + 2\,\textrm{Cov}(X,Y) $$

Correlation ($\rho_{XY}$)

$$ \textrm{Corr}(X,Y) = \rho_{XY} = \dfrac{\textrm{Cov}(X,Y)}{\sqrt{\textrm{Var}(X)}\sqrt{\textrm{Var}(Y)}} $$
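
A sketch verifying the definitions against NumPy's built-ins (the data are simulated, with $Y$ correlated with $X$ by construction):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
y = 2 * x + rng.normal(size=100_000)

cov_xy = ((x - x.mean()) * (y - y.mean())).mean()  # E[(X-mu_X)(Y-mu_Y)]
rho = cov_xy / (x.std() * y.std())

assert np.isclose(cov_xy, np.cov(x, y, ddof=0)[0, 1])
assert np.isclose(rho, np.corrcoef(x, y)[0, 1])
```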

Markov Inequality

For any non-negative R.V. $X$,

$$ P(X\geq t) \leq \frac{E(X)}{t} $$

for $t>0$.

Chebyshev Inequality

For any R.V. with finite variance,

$$ P(|X-E(X)| \geq t) \leq \dfrac{\textrm{Var}(X)}{t^2} $$

for $t>0$.
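
A simulation sketch of both inequalities (assuming NumPy; the exponential distribution and threshold $t$ are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=1_000_000)  # non-negative, E(X) = 1

t = 3.0
# Markov: P(X >= t) <= E(X)/t
print((x >= t).mean(), x.mean() / t)  # ~ 0.050 <= ~ 0.333
# Chebyshev: P(|X - E(X)| >= t) <= Var(X)/t^2
print((np.abs(x - x.mean()) >= t).mean(), x.var() / t**2)  # ~ 0.018 <= ~ 0.111
```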

Central Limit Theorem (CLT)

"the definition of the CDF for the standardised version of the sample mean from any distribution, where the sample size $n$ tends to $\infty$, is that of the standard normal"

$$ \lim_{n\to\infty} P\left[ \dfrac{\sqrt{n}(\bar{X}-\mu)}{\sigma} \leq x\right] = \Phi(x) $$

or, in other words:

"CLT implies that were one to draw $n$ samples $X_1$,...,$X_n$ independently and identically, then for reasonably large $n$ each $X_i$ need not be approximately normally distributed, but the sample mean $\bar{X} = \sum_i X_i/n$ will be approximately normally distributed."

Estimators

$\hat{\theta}$ is an estimator for the parameter $\theta$.

An estimator is efficient if the spread, Var($\hat{\theta}$), of the estimator is small.

An estimator is robust if it is resilient to errors arising from misspecification of the underlying distribution.

Mean Squared Error (MSE)

$$ \textrm{MSE}(\hat{\theta}) = \textrm{Var}(\hat{\theta}) + \left[ E(\hat{\theta}) - \theta\right]^{2} $$

where the second term is the squared bias.

Variance

For an estimator, e.g. the sample mean $\bar{X}_n$:

$$ \textrm{Var}(\bar{X}_n) = \dfrac{\sigma^{2}}{n} $$

Standard Error (SE)

$$ \textrm{SE}(\bar{X}_n) = \dfrac{\sigma}{\sqrt{n}} $$
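
A simulation sketch (assuming NumPy; $\sigma$ and $n$ are arbitrary): the spread of the sample mean across repeated samples matches $\sigma/\sqrt{n}$.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma, n, reps = 2.0, 25, 100_000

# sample mean computed in each of `reps` repeated samples of size n
means = rng.normal(loc=0.0, scale=sigma, size=(reps, n)).mean(axis=1)

print(means.std(), sigma / np.sqrt(n))  # both ~ 0.4
```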

